Measuring the Compositionality of Collocations via Word Co-occurrence Vectors: Shared Task System Description
Abstract
This paper describes a system for measuring the compositionality of collocations, built for the shared task of the Distributional Semantics and Compositionality workshop (DiSCo 2011). The system exploits the intuition that a highly compositional collocation tends to overlap considerably in meaning with its constituents (headword and modifier), whereas a collocation with low compositionality shares little semantic content with them. This intuition is operationalised via three configurations that use cosine similarity measures to detect the semantic overlap between the collocation and its constituents. The system performs competitively in the task.
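The intuition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the abstract does not specify the three configurations, so the window size, the raw-count weighting, and the averaging of the two cosines are all assumptions made here for illustration, and the function names `cooccurrence_vector`, `cosine`, and `compositionality_score` are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_vector(target, sentences, window=2):
    """Count the words found within `window` tokens of each occurrence of `target`.

    `target` is a tuple of tokens, so it can be a single word or a two-word
    collocation. The window size and raw counts are illustrative assumptions.
    """
    n = len(target)
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                counts.update(left + right)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(modifier, head, sentences, window=2):
    """Average the collocation's similarity to its modifier and to its headword.

    Averaging is an assumption; other ways of combining the two similarities
    are equally plausible.
    """
    colloc = cooccurrence_vector((modifier, head), sentences, window)
    mod_vec = cooccurrence_vector((modifier,), sentences, window)
    head_vec = cooccurrence_vector((head,), sentences, window)
    return (cosine(colloc, mod_vec) + cosine(colloc, head_vec)) / 2


# Toy corpus, invented for this example only.
sentences = [
    "the red car drove down the road".split(),
    "a red car parked near the road".split(),
    "the car drove fast".split(),
    "he ate a hot dog at the fair".split(),
]
score = compositionality_score("red", "car", sentences)
```

Under this sketch a higher score means the collocation's contexts resemble those of its constituents, which the abstract associates with higher compositionality. Note that the constituent vectors here also count occurrences inside the collocation itself; a real system would likely need to decide whether to exclude them.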
Similar resources
Identifying Collocations to Measure Compositionality: Shared Task System Description
This paper describes three systems from the University of Minnesota, Duluth that participated in the DiSCo 2011 shared task that evaluated distributional methods of measuring semantic compositionality. All three systems approached this as a problem of collocation identification, where strong collocates are assumed to be minimally compositional. duluth-1 relies on the t-score, whereas duluth-2 an...
Shared Task System Description: Measuring the Compositionality of Bigrams using Statistical Methodologies
The measurement of relative compositionality of bigrams is crucial to identify Multi-word Expressions (MWEs) in Natural Language Processing (NLP) tasks. The article presents the experiments carried out as part of the participation in the shared task ‘Distributional Semantics and Compositionality (DiSCo)’ organized as part of the DiSCo workshop in ACL-HLT 2011. The experiments deal with various c...
Determining Compositionality of Word Expressions Using Word Space Models
This research focuses on determining semantic compositionality of word expressions using word space models (WSMs). We discuss previous works employing WSMs and present differences in the proposed approaches which include types of WSMs, corpora, preprocessing techniques, methods for determining compositionality, and evaluation testbeds. We also present results of our own approach for determining...
Shared Task System Description: Frustratingly Hard Compositionality Prediction
We considered a wide range of features for the DiSCo 2011 shared task about compositionality prediction for word pairs, including COALS-based endocentricity scores, compositionality scores based on distributional clusters, statistics about wordnet-induced paraphrases, hyphenation, and the likelihood of long translation equivalents in other languages. Many of the features we considered correlate...
Exemplar-Based Word-Space Model for Compositionality Detection: Shared Task System Description
In this paper, we highlight the problems of polysemy in word space models of compositionality detection. Most models represent each word as a single prototype-based vector without addressing polysemy. We propose an exemplar-based model which is designed to handle polysemy. This model is tested for compositionality detection and it is found to outperform existing prototype-based models. We have ...